Skip to content

Introduce a new callback based bridge worker #955

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

FranzBusch
Copy link
Contributor

Motivation

The bridge code allows other languages to interop with the Core Rust SDK through the C-FFI layer. Sometimes those languages also want to bring their own custom gRPC implementation from their languages ecosystem.

Modification

This PR introduces a new callback based worker client that is also exposed through the bridge header. This allows other languages to use their gPRC client to serve the requests from the Core SDK.

# Motivation

The bridge code allows other languages to interop with the Core Rust SDK through the C-FFI layer. Sometimes those languages also want to bring their own custom gRPC implementation from their languages ecosystem.

# Modification

This PR introduces a new callback based worker client that is also exposed through the bridge header. This allows other languages to use their gPRC client to serve the requests from the Core SDK.
@FranzBusch FranzBusch requested a review from a team as a code owner July 9, 2025 13:01
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reviewed API, didn't get to implementation yet. But I am starting to think we need to be able to replace the gRPC implementation, not the worker's client implementation. There are other benefits to our other SDKs to be able to replace the low-level gRPC calls only.

* Function pointer for RPC callbacks delivering a `(success, failure)`
* pair as non-owning `ByteArrayRef`s to Rust.
*/
typedef void (*TemporalCoreWorkerClientCallbackTrampoline)(void *user_data,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
typedef void (*TemporalCoreWorkerClientCallbackTrampoline)(void *user_data,
typedef void (*TemporalCoreWorkerClientResultCallback)(void *user_data,

Pedantic, but we have not traditionally called this a trampoline in this C header, can it just be called a callback?

void *user_data;
void (*poll_workflow_task)(void *user_data,
struct TemporalCoreByteArrayRef request,
const struct TemporalCoreCallbackBasedWorkerClientPollOptions *poll_opts,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are poll options expected to be used?

/**
* Interop function pointer table implementing the RPC logic of the [temporal_sdk_core::WorkerClient] trait via callbacks.
*/
typedef struct TemporalCoreCallbackBasedWorkerClientRPCs {
Copy link
Member

@cretz cretz Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, thinking on this, I wonder if we should just have a general lower-level callback-based client and nothing specific to workers. So basically a C abstraction for gRPC client.

So basically have a temporal_core_client_from_callbacks_new that accepts some options including the single function called back for every gRPC request (so RPC name is one of the params), and it returns a pointer to TemporalCoreClient (or rather a struct that has that or fail as mutually exclusive). Now that client can be used for temporal_core_worker_new can be passed to temporal_core_client_free, etc. The API would be cleaner/simpler and the implementer does not have to be aware of retry or worker details or any of that.

I know we had mentioned this approach at #883, but instead at #924 we're exposing the single worker client trait. I think it'd be wiser to make a C bridge for gRPC client in general. This would allow our SDKs to let people just replace the gRPC implementation at client creation time instead of worker creation time. It would also allow our SDKs to provide gRPC interception if they wanted (by delegating to another client), though we could make a bit of sugar to make delegating to the default implementation easier at that time.

Thoughts @FranzBusch and @Sushisource?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I generally agree. I think it's my bad for forgetting that we wanted to do that originally. Because we had talked about it before. I think the main issue is just that it's a decent chunk of refactoring I think. Plumbing through the underlying generic thing will require touching a bunch of spots in the client crate, and allowing plugging in a https://docs.rs/tower-service/0.3.3/tower_service/trait.Service.html (I believe) into the connection code here

let channel = Channel::from_shared(self.target_url.to_string())?;

The type-level tomfoolery is probably significant. So, honestly, I'm kinda fine with this for now if it allows us to proceed. Alternatively, @FranzBusch, if you're not in a massive rush, and you're interested in taking a crack at that, that would be fantastic. Otherwise I'm not sure we'd have time to fit that in super soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time looking to see if we can drop the callback based substitution one layer down to general Client level. Overall it seems like a good alternative approach but it looks like it would require a significant amount of refactoring to get us there. The approach I took here with just allowing the worker client to be substituted was significantly more straight forward since the protocol already existed. It was also enough for our use-case since I only cared about replacing the worker client and not the overall client.

I would love to proceed with what @Sushisource suggested here and proceed with the strategy of this PR and we can track the alternative approach in an issue for further down the road. If you all are fine with this I will tackle the inline feedback on this PR.

Copy link
Member

@cretz cretz Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmm, if it means we can reduce this bridge complexity and increase benefits for other things, I may be able to carve off time to see if I can confirm/deny replacing the gRPC call with a C call in clients is not too bad. May be able to replace the tower service in some way. At least for me, if I can see it is non-trivial, it would make it easier to justify this worker-only approach. Regardless, if I can't confirm/deny in a timely manner, agreed we can move forward on this.

Comment on lines +706 to +708
const struct TemporalCoreCallbackBasedWorkerClientPollOptions *poll_opts,
TemporalCoreWorkerClientCallbackTrampoline callback,
void *callback_ud);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const struct TemporalCoreCallbackBasedWorkerClientPollOptions *poll_opts,
TemporalCoreWorkerClientCallbackTrampoline callback,
void *callback_ud);
const struct TemporalCoreCallbackBasedWorkerClientPollOptions *poll_options,
TemporalCoreWorkerClientCallbackTrampoline callback,
void *callback_user_data);

Pedantic, but would like not to abbreviate things that are exposed in the header

* - `worker`: on success, a heap-allocated [Worker] pointer; null on error.
* - `fail`: on error, a C string (UTF-8) describing the failure; null on success.
*/
struct TemporalCoreWorkerOrFail temporal_core_new_callback_based_worker_client(const struct TemporalCoreCallbackBasedWorkerClientRPCs *client_rpcs,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
struct TemporalCoreWorkerOrFail temporal_core_new_callback_based_worker_client(const struct TemporalCoreCallbackBasedWorkerClientRPCs *client_rpcs,
struct TemporalCoreWorkerOrFail temporal_core_callback_based_worker_new(const struct TemporalCoreCallbackBasedWorkerClientRPCs *client_rpcs,

or

Suggested change
struct TemporalCoreWorkerOrFail temporal_core_new_callback_based_worker_client(const struct TemporalCoreCallbackBasedWorkerClientRPCs *client_rpcs,
struct TemporalCoreWorkerOrFail temporal_core_worker_callback_based_new(const struct TemporalCoreCallbackBasedWorkerClientRPCs *client_rpcs,

Pedantic, but worth being clear this is creating a worker not a client

*/
struct TemporalCoreWorkerOrFail temporal_core_new_callback_based_worker_client(const struct TemporalCoreCallbackBasedWorkerClientRPCs *client_rpcs,
const struct TemporalCoreWorkerOptions *options,
const struct TemporalCoreCallbackBasedWorkerClientConfig *client_config,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic, but would be ok putting all other non-runtime parameters in this structure instead of as separate parameters. And call it TemporalCoreCallbackBasedWorkerOptions instead of Config. This is consistent with what we've done elsewhere I think. But not a big deal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants